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We study the non-steady relaxation of a driven one-dimensional elastic interface at the depin- 
ning transition by extensive numerical simulations concurrently implemented on graphics process- 
ing units (GPUs). We compute the time-dependent velocity and roughness as the interface relaxes 
from a flat initial configuration at the thermodynamic random-manifold critical force. Above a first, 
non-universal microscopic time-regime, we find a non-trivial long crossover towards the non-steady 
macroscopic critical regime. This "mesoscopic" time-regime is robust under changes of the micro- 
scopic disorder including its random-bond or random-field character, and can be fairly described 
as power-law corrections to the asymptotic scaling forms yielding the true critical exponents. In 
order to avoid fitting effective exponents with a systematic bias we implement a practical criterion 
of consistency and perform large-scale (L ~ 2 25 ) simulations for the non-steady dynamics of the 
continuum displacement quenched Edwards Wilkinson equation, getting accurate and consistent 
depinning exponents for this class: /3 = 0.245 ± 0.006, z = 1.433 ± 0.007, C = 1-250 ± 0.005 and 
v = 1.333 ± 0.007. Our study may explain numerical discrepancies (as large as 30% for the veloc- 
ity exponent (3) found in the literature. It might be also relevant for the analysis of experimental 
protocols with driven interfaces keeping a long-term memory of the initial condition. 

PACS numbers: 74.25.Qt, 64.60.Ht, 75.60.Ch, 05.70.Ln 



I. INTRODUCTION 



Universal dynamical behavior in disordered systems 
is usually difficult to detect and characterize, compared 
with that of pure systems. Characterizing it in strongly 
driven disordered systems thus a priori appears much 
more challenging, as we can not even rely on equilibrium 
statistical mechanics. The depinning transition of an 
elastic interface driven in a random medium is a paradig- 
matic example of such non-trivial phenomena [l|, yj. 
From a purely theoretically point of view, the elastic de- 
pinning transition of an interface can be regarded as a toy 
model for the critical irreversible dynamics of more com- 
plex strongly driven disordered systems such as "dissi- 
pative glasses" [|| whose coarse-grained dynamics can be 
highly non-trivial due to the possibility of dynamically in- 
duced anisotropies or topological defects. Indeed, a con- 
siderable progress has been done in the last years thanks 
to a fruitful interplay between the very powerful analyti- 
cal [||-[6| and numerical techniques that can be and were 
specially developed for studying equilibrium 0, H| , depin- 
ning [9l-ll3|. and creep [l4|[l5[ of strictly elastic manifolds. 
From the experimental viewpoint, on the other hand, the 
understanding of this particular problem is directly rele- 
vant for various experimental situations where the elastic 
aproximation is well met, such as magnetic [il^ - flll ] or 
ferroelectric domain walls I20H23I ]. contact lines in wet- 
ting [H, [25|] , fractures [H, |27| . It has been also useful for 
making a connection between laboratory friction experi- 
ments and the observed spatial and temporal earthquake 
clustering [28| . Describing quantitatively the apparent 
universal behavior in this variety of experimental situ- 
ations, continuously motivates the theoretical study of 
new observables and the development of new theoretical 
methods and techniques, like the ones proposed here. 



The physics of an elastic interface in a random medium 
is controlled by the competition between quenched dis- 
order (induced by the presence of impurities in the host 
material), which promotes the wandering of the elastic 
object, and the elastic forces, which tend to make the 
object flat. One of the most dramatic and worth un- 
derstanding manifestations of this competition is the re- 
sponse of these systems to an external drive, leading us to 
the above mentioned depinning transition phenomenon. 
In the absence of an external drive, the ground state 
of the system is disordered but well characterized by a 
sclf-affinc rough geometry with a diverging typical width 
w L^, where L is the linear size of the elastic object 
and Ccq is the equilibrium roughness exponent. When the 
external force is increased from zero, the ground state be- 
comes unstable and the interface is locked in mctastablc 
states. To overcome the barriers separating them and 
reach a finite steady-state velocity v it is necessary to 
exceed a finite critical force, above which barriers dis- 
appear and no metastable states exist. For directed d- 
dimensional elastic interfaces with convex elastic ener- 
gies in a D = d + 1 dimensional finite space with dis- 
order, the critical point is unique, characterized by the 
critical force f = f c and its associated critical configu- 
ration [29|, [3(|. This critical configuration is also rough 
and self-affine such that w ~ L 1 ^ with £ the depinning 
roughness exponent. When approaching the threshold 
from above, the steady-state average velocity vanishes 
like v ~ (/ — f c ) and the correlation length character- 
izing the cooperative avalanche-like motion diverges as 
£ ~ (/ — f c )~ L/ for / > f c with a typical diverging inter- 
event time where f3 is the velocity exponent, v is the 
depinning correlation length exponent, and z the dynam- 
ical exponent 0, l3lT[33j . At finite temperature and for 
/ *C /c, the system presents an ultra-slow steady-state 
creep motion with universal features dUl HI directly 
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correlated with geometrical crossovers m. At very 
small temperatures the monotonous increase of the corre- 
lation length with decreasing / below f c shows that the 
naive analogy breaks, and that depinning must be re- 
garded as a non-standard phase transition [l4|, EH ■ The 
transition is then smeared-out, with the velocity vanish- 
ing as v ~ TV exactly at / = L, with i/j, the so-called 
thermal rounding exponent 37i - 4l| . 



The non- stationary dynamics at depinning is a differ- 
ent and interesting manifestation of the competition be- 
tween elasticity and disorder, but it has received con- 
siderable less attention than the steady-state dynamics. 
Near the threshold the time needed to reach the driven 
non-equilibrium steady-state can be very long, since the 
memory of the initial condition persists for length scales 
larger than a growing correlation length £(t) ~ t x l z . 
Being only limited by the divergent steady correlation 
length £ and the system size L, the resulting non-steady 
critical regime is macroscopically large, t < £ Z ,L Z . It 
is hence relevant for experimental protocols, where it is 
in general difficult to assure history independence, as the 
memory of the initial condition is only erased by this 
slow transient process. Analogously to non-driven sys- 
tems relaxing to their critical equilibrium states |42| , the 
transient dynamics of a driven disordered system displays 
interesting, though different, universal features (43^ . 

The non-stationary dynamics is also of interest from 
the practical point of view, as it provides a (numerical 
and even experimental) method for accessing the critical 
depinning exponents of the depinning transition [HI, |4j| . 
The so called "short-time dynamics" (STD) method, 
originally restricted to standard equilibrium phase tran- 
sitions, has been applied to a variety of far from equi- 
librium situations in the last years [461 ]. For the Ising 
model, the non-steady relaxation of the magnetization at 
the critical temperature T c from a perfectly ordered ini- 
tial state was shown to obey the dynamical scaling form 
m(t) ~ t~^l l,z , with j3, v and z equilibrium exponents, 
such that the equilibrium magnetization displays a mean 
value m eq ~ (T-T c ) /3 a correlation lenght £ ~ (T-T c )-", 
and a correlation time £ 2 . If we think the velocity v as 
the order parameter of the depinning transition, analo- 
gous to the magnetization, and the driving force / as the 
control parameter, analogous to the temperature, we are 
led to propose v(t) ~ t~&' vz at / = f c , with /3, v and z 
the above mentioned out-of-equilibrium steady-state de- 
pinning exponents. The general scaling forms for the 
velocity decay and for the structure factor growth (ex- 
actly at. and) around the critical force, were confirmed 
in Ref. [44j|. Two-time critical aging relations [47[ were 
confirmed in Ref. [43[ as well. 

The critical exponents from these works were shown 
to be roughly consistent with those of Ref. [48| , but not 
directly fitted. The advantage of the STD method is that 
there is no need to reach the steady-state, but only the 
macroscopic time regime where the scaling form is satis- 
fied, expected after a first relatively short non- universal 
regime. The method is also "free" of finite size effects 



provided that the growing dynamical correlation length 
remains much smaller than the system size. We need 
however a very fine tunning to the critical force for large 
systems, in order to have a decent power-law fitting re- 
gion. Scaling corrections are sometimes also introduced 
in the fit, but in most of the cases they are purely em- 
pirical JJjj . Finally, usually a delicate analysis must be 
performed in order to extract the value of each of the 
critical exponents separately and accurately. 

Regarding the critical exponents it is worth to mention 
that obtaining accurate values for the zero-temperature 
depinning transition has been difficult, even in simple 
disordered elastic models, and large discrepancies (of the 
order of 30%) are still found in the literature. Here we 
focus on the paradigmatic quenched Edwards- Wilkinson 
model (QEW), which belongs to the so-called "short- 
range isotropic" universality class. Analytically, it can be 
tackled by the functional renormalization group (FRG) 
method. In this context it has been shown that a phys- 
ically consistent theory must involve at least two loops 
calculations @. The FRG method is however restricted 
to an cpsilon expansion around the upper critical dimen- 
sion d = 4, and its accuracy is thus a priori questionable 
for the extreme -yet experimentally relevant- case of one- 
dimensional extended interfaces. Numerically, on other 
hand, an efficient algorithm allows us to exactly deter- 
mine the critical configuration and force for an elastic in- 
terface in a given disorder landscape in a finite space with 
periodic boundary conditions (i.e. in a "torus") [TTI. fl2|. 
With this method, which has the great advantage that 
it avoids critical slowing down, we can get very precise 
values for the roughness exponent £ and v. Since the ve- 
locity scales as the typical displacement divided by the 
typical time, v ~ £^ -z , given £ ~ (/ — / c ) _l/ and ex- 
pecting v ~ (/ — /c)* 3 , we have by simple dimensional 
analysis fj = v(z — £). On the other hand, by virtue of 
the QEW model statistical tilt-symmetry [49[ we have 
v = 1/(2 — £). Therefore the QEW universality class 
can be described by only two independent exponents. In 
sharp contrast with £ (and thus v), which can be deter- 
mined from exact and efficient algorithms [TTI . [l2j , an ac- 
curate numerical determination of /3 (or z) has remained 
a challenge however, even for one dimensional interfaces. 

Direct molecular dynamics simulations around the 
sample critical force requires, first, a proper equilibra- 
tion to study the steady-state motion; taking a time of 
order L z in samples of size L x M, with M ~ i>. After 
this time, all initial conditions in a given sample coincide 
on a unique curve parametrized by the interface center of 
mass position [43j . If M is very large compared with the 
typical width of the interface (which scales as Z>), as in 
force-driven long-time steady-state simulations with "dy- 
namically generated" quenched disorder [37| . one has to 
be cautious. Since the critical force, being an extreme 
value random variable UK , slowly grows with M for 
a fixed finite L. On the other hand, if periodic bound- 
ary conditions are used, and M is small compared to the 
expected random- manifold width, i.e., M <C Z>, we can 
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have several crossovers from the random-manifold to the 
random-periodic universality class. In this respect one 
should be warned that, for a fixed L, the crossovers to 
the random-periodic class are not only controlled by M, 
but also by the velocity [H, [53|. Even if we are able 
to avoid such "random-periodic" effects by an optimal 
sample aspect-ratio choice near the threshold, one has to 
deal with size-dependent sample to sample fluctuations 
obscuring the critical region. One procedure (48j is to 
exploit the algorithm of Ref . HI I and then simulate 
the slow steady-state dynamics at a controlled difference 
from the exact critical force of each sample, so as to re- 
duce the effects of the sample to sample fluctuations of 
the critical force. This represents an important improve- 
ment over previous numerical approaches, but it is yet 
unclear weather one should disorder-average the sample 
velocity directly or, instead, a reduced velocity, in order 
to better approximate the thermodynamic velocity-force 
curve we seek. Finite-size effects also induce a subtle di- 
mensional crossover at low velocity [H, |4§[ that strongly 
reduce the size of the critical region, making it difficult 
to fit exponents accurately 

To avoid some of the ambiguities of working with force- 
driven systems it has been recently proposed to work 
in a "velocity-driven ensemble" , by pulling the interface 
with a spring of a vanishing constant m. Although this 
is an old approach, well known from friction numerical 
studies in particular, it has become very attractive both 
for numerics [5414571 ] and experiments (25j | recently; since 
FRG calculations have shown [58| that this approach al- 
lows, for a quasistatic drive, to easily get the central 
object of the effective dynamical action: the complete 
shape of the renormalized pinning force correlator. With 
this numerical method, whose equivalence with fixed- 
driving force simulations was demonstrated in Ref. [l3j |. 
it was explicitly shown that, at depinning, both random- 
bond and random-field microscopic disorder flows to- 
wards a random-field effective disorder (56[, at variance 
with the statics, where they belong to different univer- 
sality classes [55[ . At depinning this method also allows 
to get the thermodynamic critical force through scaling, 
and also the avalanche size distribution [57} • The expo- 
nents 8 or z, however, have not yet been obtained with 
this method. 

Purely discrete models with an approximate micro- 
scopic dynamics, belonging to the short-range or long- 
range elasticity continuum models universality classes, 
have been also considered 0, HH, HI]. These models 
have the advantage of being computationally very conve- 
nient, and allow to calculate the critical exponents. It is 
not clear, however, whether the approximate or "coarse 
grained" dynamical rules used in these models can change 
the interesting features of the non-steady relaxation that 
we discuss here for the continuous QEW equation, spe- 
cially at intermediate times before reaching the assymp- 
totic non-stationay regime. 

At any rate, the values obtained for the critical expo- 
nents with all the above mentioned variety of numerical 



and analytical methods are surprisingly, very disperse. 
The worst case (30% dispersion) is found for the velocity- 
force exponent 8, ranging roughly from 0.25 ± 0.05 to 

0.35 ± 0.05 @, i, ee m h, m . 

In this paper we study the non-steady relaxation of 
an elastic string in a random medium by extensive nu- 
merical simulations, extending the study of Ref. pij ]. 
We first show that the thermodynamic critical force, al- 
though strictly non-universal, can be defined unambigu- 
ously and has a unique value; this is the force that drives 
the system into the STD scaling form yielding the critical 
exponents. Second, we study corrections to the dynam- 
ical scaling form which appear at intermediate times or 
"mesoscopic" time-regime, but well above the expected 
non-universal microscopic time regime set by the micro- 
scopic disorder correlation range. We show that these 
corrections are rather robust under changes of the micro- 
scopic disorder, including its random-bond or random- 
field character. Once done this, we implement a practical 
criterion for separating these corrections and exploit the 
parallelism of GPU computing to perform, to the best of 
our knowledge, the largest-to-date molecular dynamics 
simulations of an clastic line in a random-medium de- 
scribed by a continuous QEW equation. This allows us 
to achieve long times in the non-steady critical regime 
and get, with the STD method, un-biased and precise 
estimates for the critical exponents for this depinning 
universality class: 8 = 0.245 ± 0.006, z = 1.433 ± 0.007, 
C = 1.250 ± 0.005 and v = 1.333 ± 0.007. 



Organization of the paper 

In SecJTT] we define the model and quantities of inter- 
est. In Scc lIII Al we describe the general scaling forms 
for the non-steady critical relaxation that are used for 
the analysis. In Sec lIIIBI we briefly discuss the numer- 
ical methods, leaving all computational details for the 
Appendix „ In Sec lIVI we show all the results. We first 
analyze the geometry of the critical configuration and 
the thermodynamic limit of the critical force in Sec II V Al 
Then, we study the non-steady relaxation of different ob- 
servables in Sec II V Bl We characterize in Sec lIV Cl the 
deviations from the expected scaling forms of Sec lIII Al 
and in Sec lIV Dl we analyze in detail these corrections 
to scaling, which explain the existence of biased effective 
exponents in the problem. In Sec[V]we sumarize and dis- 
cuss all the results and in Sec lVIl we conclude and give 
perspectives. 



II. MODELS, PROTOCOLS AND 
OBSERVABLES 

We study a one-dimensional elastic line described by 
a single valued function u(x,t). Here, u(x,t) measures 
the line instantaneous transverse displacement from the 
x axis at time t. In the continuum, the overdamped 
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interface dynamics is described by the QEW equation 
of motion, 



r)d t u(x, t) = cd^.u(x, t) + F p (u, x) + f 



(1) 



where / is an uniformly applied external force and 
F p (u,x) is the random pinning force. Without lack of 
generality we can set friction coefficient rj = 1 and the 
clastic constant c = 1. The pinning force has zero average 
and correlator 



F p (u, x)F p (u', x') = A(u - u')6(x - x'). 



(2) 



The overline represents the average on the disorder re- 
alizations and A(x) is a short-ranged function, of range 
Tf = 1. We consider two cases for A(x). In the so- 
called random bond (RB) case the elastic line moves in a 
random potential such that F p (u,x) = —d u ll(u,x) with 
U(u,x) bounded, and thus J A(u, x) = 0. In the ran- 
dom field (RF) case, the random potential U(u, x) is un- 
bounded and diffuses as a function of u, with diffusion 
constant J u A(u,x) > Q. In turn, the pinning poten- 
tial or forces can be sampled from different distributions. 
Here we consider Gaussian and Uniform (constant) dis- 
tributions. Analyzing all the above cases separately will 
allow us to detect possibles departures from the expected 
universal behavior and corrections to scaling at relatively 
short time and length scales. 

It is a convenient and safe procedure to discretize the 
interface displacement in the x-direction, keeping u(x,t) 
as a continuum variable. Doing so we define the center 
of mass velocity for an interface of size L as, 



L-1 



v(t) = ^ d t u(x,t) 



(3) 



2=0 



that, given Eq. (TTJ) and 77 = c = 1, is nothing else but 
the spatial average of the instantaneous total forces act- 
ing on the line. From the instantaneous center of mass 
displacement, 



L-1 



i(t) = j^2u(x,t), 



(4) 



we can define the instantaneous quadratic width of the 
interface as, 



L-1 



(*) = 7 Z>0M) -«(*)]' 



(5) 



x=0 



The geometrical properties of the line as a function of 
length-scale can be conveniently described using the av- 
eraged structure factor 



s q (t) 



1 i_1 

- ^ u(x,t)e- lc t x 



(6) 



Wc are interested in the time-dependence of the above 
disorder-averaged observables with the elastic line ini- 
tially flat, i.e. u(x, t = 0) = 0. Other, more complex 
initial conditions may be also considered, but they do 
not provide more information on the dynamics of relax- 
ation as far as they are not correlated with the random 
potential. Choosing an initially flat line allow us to easily 
detect the existence of memory depending on the length- 
scale, since correlations on this system always develop 
a roughness with self-affine length regimes described by 
positive exponents. For a globally self affine line we have 
S(q) ~ <7~( 1+2< »), thus yielding the roughness exponent 
(. Such behavior is expected to hold at the so-called 
random-manifold regime. This is, small q compared to 
both the Larkin wavevector q c ~ £~ , [with w(£ c ) ~ r/] 
and the discretization wavevector <~~j 1. For our pa- 
rameters, we have q c ~ q c i of order O(l). 



III. METHODS 

We describe here the scaling forms expected for the 
non-stationary relaxation of an elastic line in a ran- 
dom medium at large times, and introduce the numerical 
methods used to simulate the dynamics. 



A. Universal Non-steady relaxation 

It was originally shown [60l ] using Renormalization 
Group techniques for a Type A model, and then widely 
extended empirically to other models, that the short-time 
behavior of the order parameter's n-th moment close to 
a critical point follows the universal scaling relation 

m( n) (t, h, L, m ) = b- nf3/u m^ {b~ z t, b 1/u h, b^L, Wmo). 

(7) 

Here t is the time, h = (H — H c )/H c is the reduced 
driving field of the transition associated with the order 
parameter (such as magnetic field, temperature, etc.), L 
is the system size, b is a scaling factor, mo is the order 
parameter initial value, fi is an universal exponent as- 
sociated with the short time behavior, while /?, v and z 
are the usual critical and dynamical exponents. This re- 
lation was analytically shown to be valid in the limit of 
small initial order parameter mo <C 1, with t larger that 
some non-universal microscopic time and smaller than 
the equilibration time T eq ~ L z . Nevertheless, in agree- 

and ana- 
631, it was 



where q — 2nn/ L, with n = 1, . . . , L — 1. 



ment with numerical simulations [461 I61L 162 
lytical approximations in mean-field models 
shown that the following homogeneity relation is valid in 
the short (but macroscopic) time regime when starting 
from an ordered condition (commonly mo = 1). 

m (n) (t, h, L) = b- nfs/v m {n) {b- z t, b 1/l, h, b^L). (8) 

In our system, and making the usual analogy with 
standard phase transitions, we consider the velocity v 
as an order parameter and the adimcnsionalizcd force 
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(/ — /c)//c as the reduced driving field. While it is not 
clear yet how to implement a protocol yielding an ini- 
tial condition equivalent to niQ << 1 and testing the 
complete relation Eq. ([?])■ the implementation of an or- 
dered equivalent initial condition is very simple, and a 
relation like Eq. dHt has been numerically checked in 
this model [U [44|. In fact, the completely ordered ini- 
tial condition should correspond to the infinite velocity 
configuration. Since at large velocities the effect of the 
quenched disorder mimics a small thermal noise which 
vanishes with increasing v, this initial condition should 
correspond to the completely flat condition (i.e. we make 
a "quench" from an infinite to a finite force) . By analogy, 
we therefore expect the velocity short time behavior 

v(t, h, L) = b- f} ' u ~v±{b- z t, b^ u h, b~ l L), (9) 

where h = \f — fc\/fc an d the function v± has two 
branches depending on the sign of / — f c . By choosing 
b = t 1 ^ in © 

v(h,L,t) = t- p / vz i ± {t 1 / zl, h,r 1 / z L), (10) 

Here, the functions v+ and V— are such that for h > 
and L ^ hr v , v + — > const, and v_ — > in the large 
t limit. It is worth noting here that while h~ v can be 
asocciated with the geometrical correlation length £ di- 
verging in the lim / — > fj in the steady state, this is not 
true in the lim / — >• /~ . We can, however, find a diver- 
gent correlation length £ re i ax ~ (f c — /) _ ", not observed 
in the steady-state geometry, but associated with the de- 
terministic part of the avalanches that are produced in 
the steady-state d yna mics of the / < f c low tempera- 
ture creep regime |l4, [l5j ■ Hence, the interpretation of 
Eq. (fTU)l is slightly different from the one derivated out of 
Eq. ([8]) for the Ising model (and other similar standard 
phase transitions), where divergent equilibrium correla- 
tion lengths do exist approaching the critical point from 
both sides. At depinning, the relaxational dynamics de- 
scribed by Eq. (jlpp is valid in a short-time regime, after 
which the velocity reaches a steady-state value if / > ,/ c , 
while for / < f c the velocity is blocked in a metastable 
state with memory of the initial condition and v — > [44| • 
When T > 0, and for / < f c , activated processes permit 
the system to continue relaxing towards a steady-state 
with finite velocity [40^ . 

Exactly at the critical point (h = 0) and in the limit 
L — > oo we expect a power-law behavior for the velocity 

v~t- /» z . (11) 

Equation (jlip can be also heuristically derived from a 
very simple physical picture, by assuming that the non- 
steady relaxation at the critical point is controlled by a 
single growing correlation length £(t) ~ t 1 ^, and that 
in the macroscopic time limit, some critical steady-state 
relations are "instantaneously" obeyed, with £(t) play- 
ing the role of £, when (<I, Hence, the steady-state 
relation v ~ ^ u translates into v ~ £(t) f3 / v ~ t~P^ z , 



leading to Eq. (|11[) . The physical picture behind this 
"mapping" is that lcnght-scales below £{t) are expected 
to be steady-state quasi-equilibrated, while above it the 
memory of the initical condition is still kept. This sug- 
gests that the structure factor should be described, above 
some microscopic scale and £(t) <C L, by 

S q (t) ~ q-(™~S(q£(t)) ~ q-^SiqtV*), (12) 

with S(y) ~ const, for i/ > 1, and S(y) ~ y 1+2 ^ for y <g ; 
1 . This is confirmed by numerical simulations [43|, |44| . 
Note that this simple scaling is also obeyed in the absence 
of disorder but in the presence of thermal noise [64j|. For 
the depinning transition S(y) is expected to be a much 
more complicated function however, as the disorder in 
principle couples all Fourier modes. The above scaling 
relation also implies 

w{t) - £{tf - t-V' (13) 

for the global width, or roughness. Note that this simple 
result depends on the flat initial condition choice. Other- 
wise we would have a memory contribution, that can be 
included as wit) ~ t~^/ z w(t x / z L^ 1 ), with w(y) ~ const, 
for y <C 1, and a function depending on the initial con- 
dition for i/> 1. The flat initial condition is thus conve- 
nient since w(y) — const, for all y. 

Since Eq. (fTTj) is valid exactly at h = 0, evolving the 
system at different driving forces from a flat initial con- 
dition is then possible to determine the critical force and 
extract the steady-state critical exponents. For the cor- 
rect application of this STD method, we need however 
a good criterion to decide the time-range where to fit 
the exponents accurately. This is particularly tricky if 
scaling corrections are important and also described by 
power-laws, since they can lead to effective power-law 
decays yielding biased incorrect exponents. As we show 
later on, for our case, we can exploit known depinning 
scaling relations and combinations of observables to find 
a good criterion. 



B. Numerical Simulation 

The numerical simulation protocol is roughly the same 
as the one used in Ref. [3], but here, in addition, we 
simulate different kinds of disorder, and implement mas- 
sively parallel algorithms. Below we briefly summarize 
the method. We refer the reader to the Appendix for fur- 
ther details; in particular, for the description of our GPU- 
based parallel implementations of the algorithm which al- 
low us to computationally reach very large system sizes, 
running simulations with specdups above 300x. 

In order to numerically solve Eq. Q]) the system is dis- 
cretized in the x-direction in L segments of size 5x = 1, 
i.e. x — > j = 0, L — 1, while keeping Uj(t) as a contin- 
uous variable. Computing the forces at each time-step, 
the integration of Eq. ([TJ is done using the Euler method. 
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To model the continuous quenched random potential 
we can either read from memory, or generate dynami- 
cally, uncorrelatcd random numbers with a finite variance 
at the integer values of u and x and use interpolation to 
get F p (u,x). In this work we consider both cubic-spline 
and linear interpolation between the random potential 
values sampled at the integers. This only changes the 
shape of the microscopic force correlator, but not the uni- 
versality class. By generating directly the force-field at 
integer positions from zero-mean uncorrelated random- 
numbers in both directions we can get a RF disorder, 
while by doing the same for the potential-field and de- 
riving the force we can get a RB disorder. In both cases 
Eq. @, with a short-range force correlator, is automati- 
cally satisfied. 

For our numerical simulations we have used periodic 
boundary conditions in the longitudinal direction, so that 
Mo is elastically coupled with ul-i- In the case where we 
construct continuous splines for the disorder potential or 
we limit ourselves to read a precomputed force field, we 
enforce periodic boundary conditions also in the trans- 
verse direction, thus defining an L x M system. On the 
other hand, when a dynamically generated disorder case 
is used, the system size in the u direction is virtually in- 
finite, although in our implementation it can be enforced 
to be periodic as well (see Appendix). 

A critical configuration u s c {x) and a critical force /* 
can be unambiguously defined for a periodic sample of 
size L x M and with a given disorder realization. They 
are defined from the pinned (zero-velocity) configuration 
with the largest driving force / in the long time limit 
dynamics. They are the real solutions of 



:d 2 z u(x) + F p (u,x) + f = 0, 



(14) 



such that for / > there are no further real solutions 
(pinned configurations). Middleton theorems [2j| can be 
used to devise an efficient algorithm which allows to ob- 
tain the critical force /* and the critical configuration 
u b c j for each independent disorder realization iteratively, 
not having to solve the actual dynamics Eq. ((TJ) nor di- 
rectly inverting the non-linear system of Eqs. (|T4l . In the 
following section we use such algorithm [ll], [l2( to study 
the finite-size effects on /* and in particular to obtain the 
appropriate thermodynamic limit f c = Hni(i„M)-*oo fc 
controlling the universal non-steady relaxation. 



IV. RESULTS 

A. The critical configuration and the 
thermodynamic critical force 

We start by analyzing the geometry of the critical con- 
figuration and the critical force in periodic s amp les of size 
L x M, using the exact algorithm of Refs. [ill. Il2|. for a 
cubic-spline RB potential. This allows us, in one hand, 
to get a very precise estimate for the exponent £ directly 
from steady-state solutions for individual samples u s c {x) 
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FIG. 1. (Color online) Structure factor of the critical config- 
uration averaged over N = 1000 samples of size L = 2048 
and M = 13777. Using the self-affinity developed at long 
wavelengths, S(q) ~ q~ ( - 1+2< '\ we fit the roughness exponent 
C = 1.250 ± 0.005. Inset: rescaled structure factor qualita- 
tively displaying the accuracy of the fit, and the discretization 
effects at large q. 



and use it to disentangle the combinations of exponents 
appearing in the non-steady universal part of the relax- 
ation. On the other hand, it allows us to get the value 
f c for the thermodynamic critical force, and show, from 
an anisotropic finite-size analysis, that it is unambigu- 
ously defined. In the next sub-section we show that the 
results obtained in this section are fully consistent with 
the non-steady dynamics: The long time geometry tends 
to the self-affine one of the critical configuration and in 
particular, f c is exactly the force for which the velocity 
relaxation asymptotically displays a power-law decay in 
time, as long as the growing correlation length remain 
smaller than the system size L 

We determine the critical configurations u 8 c (x) and the 
associate critical force f^(L,M), for several samples of 
size L x M. To determine £ we calculate the time- 
independent averaged structure factor S q of the critical 
configurations and fit the expected S q ~ q~( 1+2 behav- 
ior. The advantage of using S q over the global width, 
w(L) ~ L^, is that it allows to better detect the possible 
presence of undesirable crossover effects, and the charac- 
teristic associated length-scale. In particular, when de- 
termining (, one should be cautious about the crossover 
to the random-periodic regime, with expone nt C rp = 1.5 
occurring at the length-scale Ljj ~ M 1 ^ [52|]. In our 
case we have thus chosen M > for extracting £. In 
Fig. Q]wc observe that S q ~ q~( 1+2 holds to high accu- 
racy in the long wavelength limit (discretization effects 
appear at very short wavelengths) for a system of size 
L = 2048 and M = 13777. From a power-law fitting 
giving 1 + 2( = 3.50 ± 0.01 we deduce C = 1.250 ± 0.005, 
with better precision but still in very good agreement 
with previous estimates obtained with the structure fac- 
tor at finite velocities in the steady-state [4S| a nd with 
the scaling of the global width w(L) ~ 



In order to study the thermodynamic limit of the dis- 
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FIG. 2. (Color online) Dependence of the finite-size critical 
force /c on the longitudinal size L for periodic samples of size 
L X M with M = kL^ . The dashed line corresponds to the 
thermodynamic limit f c — 1.916 ±0.001. 



order averaged critical force, f£(L, M), we use M = fel>, 
with fc a finite control parameter. Any finite value of fc 
leads to a parametrized family of universal critical force 
distributions, ranging from the Gaussian to the Gumbel 
distributions [50j . For our present purposes, it is im- 
portant to know how the thermodynamic limit L — > oo 
depend on the aspect-ratio parameter fc. In Fig. [2] we 
show /* against L, for different values of fc. As we can 
see, the size-dependent average critical force /* tends 
to a unique value in the limit L — > oo when keeping 
fc fixed. Furthermore, at finite L, is smaller than 
f c for fc < 2, and larger than f c for fc > 2, where 
f c therefore represents the thermodynamic critical force. 
This behaviour is consistent with the crossover from a 
random-periodic depinning (fc — > 0), with a thermody- 
namic critical force smaller than the random-manifold 
one, towards the extreme-statistics random-manifold de- 
pinning with an infinite thermodynamical critical force 
(fc — > oo), with fluctuations described by the Gumbel dis- 
tribution HH . What is important to remark here [66[ , 
is that for all the range of finite values of fc we have ana- 
lyzed, the curves slowly converge to the same thermody- 
namic limit f c = lim (L j V/ )^ 00 fl\ M = kL i = 1.916 ± 0.001. 
This value is in very good agreement with the value ob- 
tained for the same microscopic model with other meth- 
ods, such as quasi-statically pulling the string with a soft 
spring [5f|. 

Therefore, we see that although f c is not universal, 
for fixed microscopic disorder correlations it attracts the 
infinite family of sample geometries that is compatible 
with the random- manifold setting M = klS . Therefore, 
this setting alone removes the ambiguity of defining the 
critical force of a periodic sample in the thermodynamic 
limit. Since the short-time relaxation of the string is con- 
trolled by a single growing correlation length £(t) ~ i 1 / 2 
describing the slow development of correlations from mi- 
croscopic scales, it can not be sensitive to the dimensions 
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FIG. 3. (Color online) Non-stationary averaged string ve- 
locity as a function of time for different driving forces, (a) 
Raw data. The best power-law curve corresponds to / = 
/ C STD = 1.915. (b) Rescaled data. Parameters / C STD = 1.915, 
/3 = 0.33, z — 1.5 and v = 1.33 were used. Curves corre- 
spond to L = M = 8192, averages were taken over iV = 1000 
samples. 

or aspect ratio of the computational box, provided that 
£{t) < L and £(i) c < M. It is then natural to expect 
that f c , which is only sensitive to the microscopic pin- 
ning correlator for the random-bond family M = kL^, 
is the force that must be tuned to target the universal 
non-steady relaxation regime. 



B. Short-time dynamics scaling 

We now turn to the study of the relaxation dynamics. 
We start by reproducing previous results [iij . with ex- 
actly the same model, but using larger systems, in order 
to make visible the effects that were neglected before. 

In Fig. |3] we observe the averaged string velocity be- 
havior against time. When the applied force / is greater 
than the thermodynamic critical force f c , v(t) saturates 
to a finite value at a time which increases as we approach 
f c from above. On the other hand, for forces smaller than 
f c , v(t) goes to after a transient, which is longer as we 
approach f c from below. Exactly at / = f c we expect 
v(t) ~ t~^l vz after a microscopic time regime. One can 
in principle use this as a criteria to determine f c . Of 
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course, it is very difficult to hit exactly / c , but what we 
can do is to bound it from above and below. As we ap- 
proach f c it takes longer and longer times (and requires 
better averages) to determine if a velocity curve for a 
given force is saturating or vanishing. Big system sizes 
help to reduce noise, since v(t) self averages. A detailed 
inspection of the simulations results presented in Fig[3ta) 
permits to determine fc TD = 1.915 ± 0.002 using the 
STD method, in complete agreement with the extrapo- 
lation to the thermodynamic limit of the exact critical 
force obtained in Sec. IIV Al This shows the consistency 
of the two methods, and from now on we refer to f^ TD 
simply as f c . 

With the value of f c , from Eq.® one can test a joint 
scaling form for the force and time-dependent velocity. 
Considering L — > oo and using b ~ i 1 / 2 in Eq.© we get 



time 



v(t) t 



P/vz 



V± 



t 1/z (f-fcT 



(15) 



where v± are universal functions, and the ± sign indicates 
if the critical point at f c is approached from above or 
from below. This scaling relation is tested in Fig. (3Jb). In 
order to appreciate the difference with previous studies of 
the same problem the following values for the exponents 
have been used: z = 1.5 0, /3 = 0.33 |H|, and v = 
1.33 [44|. It can be observed that the curves collapse 
into two sets: on one hand those with / < / c , with V-(x) 
going to zero, and on the other hand those with / > f c 
with a saturation to a finite value of v+(t). 

In FigH] we show the string averaged structure factor 
at different times during the evolution of the system in 
the presence of the applied force / = f c = 1.915. As can 
be seen in Figdja), at large length scales (small values 
of q) the system keeps memory of its initial state which 
we have chosen to be flat. At small length scales though 
(but not as small to explore the lattice effects), the ge- 
ometry of the string becomes approximately self affinc, 
being characterized by a behavior S q ~ q~ l -. 1 + 2 where £ 
is the roughness exponent. In FigHJb) we test the scal- 
ing hypothesis of Eq. (fT2")) , by using z = 1.5 [44[ and 
C = 1.25 [H,[6|]. 

As we can observe in Figures HJb) and[3Jb), the as- 
sumed critical exponents (3 = 0.33, v = 1.33, Q = 1.25, 
and z = 1.5 produce rather good collapses. We will argue 
however that these exponents are effective and have an 
appreciable bias, which causes the small deviations ap- 
preciated in those collapses. This can only be perceived 
simulating large systems and reaching several orders of 
magnitude in time. In Fig|5]we show v(t) and w(t) vs 
t at the critical force f c . Above the microscopic time 
regime of order t m ; c ~ 1 we observe in v(t) a crossover 
at a time t cross ~ 50, between two approximate power- 
law decays with exponents sa 0.16 for short but meso- 
scopic times, and w 0.13 for times larger than a few tens 
(see Fig[5]Ja)). A similar crossover is seen in FigJSJb) 
where we observe a crossover between two approximate 
power-law decays, from w 0.67 to w 0.84. This behav- 
ior is unexpected from the universal non-steady relations 
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FIG. 4. (Color online) Structure factor of the string at dif- 
ferent times during evolution at / = f c . (a) Raw data, (b) 
Rescaled data. Parameters z = 1.5 and £ = 1.25 were used. 
Curves correspond to a system size L — M = 8192, averages 
were taken over N = 10000 samples. 
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FIG. 5. (Color online) Non-stationary velocity (a) and width 
(b) as a function of time for / = f c = 1.915. Black dots 
correspond to data. Dashed red and green curves are power 
law fits in regions 1 < t < 30 and 30 < t < 21000, respectively. 
Data correspond to L = M — 8192, averages were taken over 
N = 3500 samples. 
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FIG. 6. (Color online) Velocity at the corresponding crit- 
ical forces of the four different cases defined in the text, 
(top) Raw data. Power-laws fits of the form v ~ t^^" z fox 
times t > 30 give similar results for all curves, consistent 
with P/vz = 0.129 ± 0.002. (bottom) Velocity multiplied by 
t 0A2g . Dashed horizontal lines showing the apparent scaling 
regime are displayed to guide the eye. All curves correspond 
to L = 4096, M = 8192 and averages over N = 50000 sam- 
ples. 

v(t) ~ t~Pl vz and w(t) ~ t^/* , which generally assume a 
microscopically very short transient regime before reach- 
ing the truly universal macroscopic time regime. In order 
to understand this behavior, in the next section we ana- 
lyze this crossover in detail. 



C. Robustness of the Crossovers 

The first question we can ask is how robust are the 
observed crossovers and in particular, how much they 
depend on specific details such as the intensity or the 
shape of the microscopic disorder correlator. 

For a given disorder distribution, we have first checked, 
by simulating v(t) at the corresponding critical force, 
that the crossover time (t cross ~ 50) does not change 
appreciably by changing the nature of the disorder. 

In order to study the dependence of the crossover with 
the shape and nature of the microscopic disorder correla- 
tor, we have fixed A(0), defined in Eq. @, and analyzed 
four different cases: 

• Case 1: Gaussian-distributed disorder potential 
with cubic spline interpolation between the inte- 
gers (RB), 

• Case 2: Uniformly-distributed disorder potential 
with cubic spline interpolation between the inte- 
gers (RB), 

• Case 3: Gaussian-distributed disorder potential 
with linear spline interpolation between the inte- 
gers (RB), 



• Case 4: Gaussian-distributed random forces (RF) 
without interpolation between the integers. 

Case 1 corresponds to the usual Gaussian distributed 
disorder with cubic spline we have used in Section IIVBI 
Case 2 consists in using a uniformly distributed disor- 
der instead of the Gaussian distributed numbers of the 
first case. In Case 3 we determine the continous po- 
tential using linear interpolation instead of a cubic one, 
so the pinning force as a function of the position has 
discontinuities at the integers. This first 3 Cases corre- 
spond to random -bond disorder, as th e quenched poten- 
tial fluctuations [U(u,x) — U(u',x')] 2 ~ S(x — x') sat- 
urate at long distances |« — u'\. Case 4 is a typical 
random-field with G aussian disorder, corresponding to 
[U(u,x) — U(u' ,x')] 2 ~ 6(x — x')\u — u'\ for large |it — u'\. 
Case 4 helps us in particular to answer the question 
whether the observed crossovers may be related with the 
crossover from RF to RB expected for depinning Q . 

In Fig|6] we show the time dependent velocity for the 
four cases, each one at its corresponding thermodynamic 
critical force, determined with the STD method. Apart 
from a factor depending on f c , qualitative behavior of 
v(t) appears unaltered for times longer than some mi- 
croscopic threshold £ m ; c ~ 1. This time is controlled 
by the microscopic correlator range (44| . which is the 
same in the four cases, rj = 1. If we fit pure power- 
laws for all curves in a region t > t cross 50 we find, 
within the error bars, consistent values for the exponent 
P/vz - 0.129, as shown in Table H] By plotting wt 0129 
vs. t, we clearly see that corrections to scaling at in- 
termediate times (t m i c < t < t cross) are present. They 
have an approximate algebraic decay, and seem to have 
a weak dependence with the microscopic disorder. We 
thus conclude that the observed value for p/vz is univer- 
sal as expected. Moreover, corrections to scaling appear 
to be robust: they do not strongly depend on the inten- 
sity shape, or nature (RB vs RF) of the short-ranged 
correlated microscopic pinning force. 



Case 


description 


u 


fitted P/vz 


1 


RB CS Gaussian 


1.915 ± 0.002 


0.130 ±0.001 


2 


RB CS Uniform 


1.848 ± 0.002 


0.128 ±0.001 


3 


RB LS Gaussian 


1.565 ± 0.002 


0.129 ±0.001 


4 


RF Gaussian 


0.971 ± 0.001 


0.130 ±0.001 



TABLE I. Critical forces and exponents for four different 
cases: CS and LS stand for Cubic-Spline and Linear-Spline in- 
terpolations for the potential, respectively. RB and RF corre- 
spond to random-bond and random-field cases. Power-law fits 
are consistent with an universal value P/vz = 0.129 ± 0.001. 



The crossovers in Figf5] can be thus possibly ascribed 
to leading order power-law scaling corrections to the uni- 
versal non-steady dynamics, similar to what is observed 
in Ising, Potts and XY models for instance (67l-l69l| . As it 
is customary, one can try here to fit a "corrected" formula 
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FIG. 7. (Color online) ^(t) as function of time for Case 1 
and different system sizes at / = f c = 1.915. Vertical dashed 
lines separate qualitatively the different dynamical regimes 
observed. Curves correspond to M = 32768, averages were 
taken over N = 5000 samples for L = 512, 1024, 2048, N = 
3000 samples for L = 4096 and N = 1700 samples for L = 
8192. 



in order to get an unbiased value for /3/vz: 



v = vot- 0/ - 



1 + [ T 



(16) 



However, with the present data, we find that this proce- 
dure is inaccurate as it involves several parameters intro- 
duced through an ad-hoc formula for the correction. 

This takes us to our next step, the search for a different 
practical criterion to directly estimate t cross . We start 
observing that in the truly universal non-steady macro- 
scopic critical regime, the quantity 



7(t) 



tv(t) 



(17) 



should reach a time- independent constant value as a con- 
sequence of the scaling relation (3 = v(z — £): since 
at f = f c in the macroscopic regime, v(t) ~ t~^l vz , 
w(t) ~ t^' v ', then y(t) ~ const. This criterion was used 
to determine the depinning threshold in the long-range 
discrete elastic model [l3|], since any departure from f c 
would make 7 to vanish or to increase with time at very 
long times. For our present purposes, we note that if the 
measured y(t) depends on time, it means that we have 
not reached the critical scaling regime yet. Therefore, a 
constant 7(i) is, at least, a necessary condition. 

In FigJT] we show the behavior of j(t) with time for 
Case 1 . After a microscopic regime we can see a decreas- 
ing behavior of j(t) with time, which implies an "effec- 
tive" unbalanced relation [v(z — Q — f3] cS < 0. The sys- 
tem is then in what we call a "mesoscopic" regime where 
the critical dynamics scaling [Eq. ©] is still not valid. 
Nevertheless, it is possible to fit a reasonable power-law 
for v(t) in this regime, if larger times are not available. 
We hence see the importance of having a good criterion 
to decide where to fit exponents: if time is not large 
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FIG. 8. (Color online) 7(f) as function of time for the RB case 
with Gaussian distributed disorder. The system size is L = 
2 25 = 3355 4 4 32. The continuous line is a fit using Eq. (TTg)) . 



enough, the resulting exponents would be effective and 
have a systematic bias. Moreover, using 7(t) as a method 
to determine f c may also suffer from these "mesoscopic- 
time effects" , as its initial decay may be incorrectly as- 
cribed to having an applied force being larger than f c . 
After that long crossover, we can observe that j(t) de- 
velops a plateau starting at t ~ 1000. This plateau is cut- 
off by finite-size effects, which introduce an L-dependent 
characteristic time. We also observe that scaling correc- 
tions in v(t) or w(t), shown in Figs(5l and El are impor- 
tant in the mesoscopic time regime, clearly visible in the 
time-dependence of j(t). We thus argue that within the 
plateau in j(t), scaling corrections become already negli- 
gible, and the true critical exponents can be consistently 
extracted with the STD method. As we see in FigJTl this 
criterion is strongly limited by finite size effects, reducing 
the time range of the plateau. We observe that in order 
to obtain a sufficiently wide time window to fit accurately 
the exponents (i.e. a few decades in time), we really need 
to attain big system sizes, much larger than L = 8192. 



D. Large Scale Simulations 

With the numerical method implemented so far, in pe- 
riodic samples of size LxM, we are limited [7(| to system 
sizes not much bigger than L = 8192. This is because we 
need sufficiently large M in order to avoid undesirable pe- 
riodicity effects, i.e. a crossover to the random-periodic 
class [531 ] . We hence move to our "memory free" numer- 
ical implementation, where we dynamically generate the 
underlying disorder instead of reading it from a previ- 
ously stored array, similarly to what was done in Ref. (37j 
but with concurrent computations. As we explain in the 
Appendix, in this implementation it is simpler to work 
with linear splines for interpolating the potential, there- 
fore corresponding to the Cases 3 (RB) and 4 (RF) of 
section HV CI for which we have already shown that scal- 
ing corrections are present. 

In Fig[5]we present the behavior of 7(4) with time for 
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FIG. 9. (Color online) String width w(t) as function of time 
for the RB case with Gaussian distributed disorder. The sys- 
tem size is L = 2 25 = 33 5544 32. 



FIG. 10. (Color online) String velocity v(t) as function of 
time for the RB case with Gaussian distributed disorder. The 
system size is L = 2 25 = 3355 4 4 32. 



a string of size L = 2 25 = 33 5 5 4 4 3 2, for Case 3. Simulat- 
ing such large systems benefits us with the self averaging 
of the velocity. Indeed, in Fig|5]the smooth curve at f c 
was averaged only over 5 samples, while the other two 
correspond to only one sample. Secondly, notice that, 
comparing with Table HI we are now able to improve the 
determination of the thermodynamical critical force with 
the STD method, pushing the uncertainty to the 4th dec- 
imal cipher, f c = 1.5652 ± 0.0003 for Case 3. At this 
value of f c we obtain a well defined plateau in 7, lasting 
approximately two decades, starting at t ~ 1000. Ignor- 
ing the microscopic time regime, t < t m i c ~ 1, we can 
now perform a good fit of the data at / = f c with the 
expression 



l(t) = Go 



(18) 



Here we have introduced the fitting parameters Go, t g 
and a g , obtaining Go — 1.143, t g ~ 0.142 and a g ~ 
0.56. This allows us to quantify the development of the 
plateau, and to confirm the power-law shape of the scal- 
ing correction in 7(i). We are now prepared to consis- 
tently fit both, the scaling corrections in w(t) and v(t) 
and the combination of critical exponents Qfz and /3/vz 
respectively. 

In Figs. P and ITU1 we show w(t) and velocity v(t) as 
a function of time, for L = 2 25 = 33 5 5 4 4 32 and for an 
applied force f c = 1.5652, corresponding to a RB linearly 
interpolated random-potential generated with Gaussian 
numbers. In Fig. [TUJ we have also included the behavior 
of the string for two other forces, just above (/ = 1.5655) 
and just below (/ = 1.5649) the estimated depinning 
force f c . For w(t) the corresponding three curves are in- 
distinguishable, so we only show the curve corresponding 
to / = f c . We can see very nice asymptotic power-law 
behaviors of both quantities spanning several order of 
magnitude. Nevertheless, as noticed for y(t) in Fig|51 
a proper power law fit can only be accomplished after 
deciding a starting point of the scaling regime. With 
the criterion of a plateau in j(t) such a starting point 



could be decided up to some arbitrariness, but it can 
be included in the final uncertainty of the fitted expo- 
nents. In addition, we propose corrected expressions for 
w and v, that can be used to fit the data during all the 
non-steady evolution, excluding the microscopic regime 
< t < t m ic ~ 1. The corrections to scaling in these 
quantities then read 



w 



(t) = Wo t c/2 



v(t) = V Q t-V* 



' / 1 • 



(19) 



(20) 



The fits give W ~ 0.839, t w ~ 0.485 and a w ~ 0.712 for 
the width, and V ~ 0.721, t v ~ 0.110 and a v ~ 0.715 for 
the velocity. The insets of Figs. |H] and [TU] confirm that 
corrections can be fairly described by power-laws, as ob- 
served earlier in Sec. lIVCl for smaller systems. The fitted 
parameters are to some extent sensitive to the choice of 
t m ic but this arbitrariness can be included in the error 
bars, and the proposed form of the corrected critical be- 
havior is robust. 

For completeness, we also present the corrections to 
scaling for the RF case. In Fig. [TT] we can see the be- 
havior of the string velocity with time when we apply 
the critical force for the RF case f c = 0.9713 ± 0.0002 
and different forces close to f c . The complete picture of 
the RB case is repeated, but now the expression Eq. (|2TJ)) 
gives back slightly different exponents for the corrections 
to scaling. The fits give Vb — 0.53, t v ~ 0.382 and 
a v ~ 0.927 in this case. Nevertheless we cannot confirm 
if the correction exponents are distinguishable or not be- 
tween RB and RF cases, since they are very sensitive to 
the choose of the arbitrary quantity t m i C , the time from 
which we start to fit. For example for the RF case a v 
changes between 0.69 and 0.99 when varying t m i C from 2 
to 10. 

However, for the RF case we find a /3/uz which is indis- 
tinguishable from the one extracted for the RB case, as 
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FIG. 11. (Color online) String velocity v(t) as function of 
time for the RF case with Gaussian distributed disorder. The 
system size is L = 2 25 = 33 5 5 4 432. 
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FIG. 12. (Color online) The combination of exponents l—^/z 
and P/vz extracted from fittings to the data of Figs. [^land HUl 
respectively. The horizontal axes shows the chosen values of 
three different arbitrary quantities for the fitting procedure. 
In black we show the time t m ic on and after which we fit using 
the corrected expressions ()19p and (|20[). In red its shown the 
time tpi a t from which we can consider 7(i) is constant, and we 
fit w(t) and v(t) with pure power-laws v ~ f' 3 /" 2 and w ~ f^ z , 
without corrections. In green we show the exponents as a 
function of to, withv to such that we fit the data with pure 
power-laws in a window [to,4to]. 



detailed in the following section. With the precision we 
have we can not rule out however that these exponents 
are actually the same for the RB and RF cases, and thus 
maybe universal. 



E. Critical exponents 

We are now in position to accurately and consistently 
determine the critical exponents (3 and z from the non- 
stationary dynamical behavior of the system. In FigfT2l 
we show our determination of f3 /vz and £/z (presented as 
(1 — (/z) since these are expected to be equal) extracted 
from v(t) and w(t) respectively. To check the consistency 
of the fitted values we have used three different fitting 



procedures: 

• First, as it is customary, we use the corrected ex- 
pressions (fl~9|) and (|20|) to extract {Q/z) and ((3/vz), 
respectively. The fit should be performed above 
the microscopic regime t > t m i C . Since the "micro- 
scopic" time t m ic is not precisely defined we let it 
vary in a reasonable range t € [2, 20] and compare 
the values fitted for ((/z) and (ft/vz) according to 
CSJ and HUD. 

• Second, we use the "plateau in 7" criterion to fit 
our exponents. This is, we first observe the be- 
havior of j(t), and propose a time t p i a t, above 
which we can consider the plateau well developed. 
From t > tpi a t we fit separately w(t) and v(t) with 
the pure power-law expressions w = wot^^ z and 
v = vtfPl vz , respectively. Again, since t p i at is not 
precisely defined, we allow it to move in a wide win- 
dow tpiat € [500, 40000] and observe how the fitted 
exponents ((/z) and {fi/vz) behave. 

• Third, we perform pure power-law fittings of our 
data, as in the previous case, but now we perform 
fits within time windows t S [to, 4to] where to is ar- 
bitrary chosen in a wide range of values. This is to 
show how the resulting exponent changes depend- 
ing on the selected time window and to mimick the 
situation where long-time runs of large systems are 
not available. 

FigHH shows that the three fitting procedures lead to 
the same result /3/vz « 1 — (,/z w 0.128, for sufficiently 
large t m i C , t p i at and to. Interestingly, the values fitted for 
1 — Q/z and fi/vz in the third procedure arc very sensitive 
to to, and show a clear tendency to produce a positive 
bias on both at small times. This explains the prev ious 
observations of larger effective values for (/3/Vz) [4J]. In- 
creasing to, while keeping the window selection as [to, 4frj] 
(other choices different than the factor 4 give the same 
qualitative result), the exponents values decrease and at 
some point the dependence with to ceases. This coin- 
cides with the beginning of the range of good values for 
tpiat ~ 1000. From this analysis we conclude for the 
depinning transition of the QEW elastic line, that: 



P/{vz) = l-(/z = 0.128 ± 0.003. 



(21) 



Combining with the value of C = 1-250 ± 0.005, obtained 
in Sec. IIV Al with a reliable exact steady-state method, 
this finally gives 



1.433 ±0.007: 



(22) 



and considering the statistical tilt-symmetry of the 
model, which yields v = 1/(2 — £), we obtain 



and 



v= 1.333 ±0.007 



0.245 ±0.006. 



(23) 



(24) 
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FIG. 13. (Color online) String velocity v(t) as function of time 
for the RB case with uniform distributed disorder for wich 
f c = 1.5652 using St = 0.1. The system size is L = 4194304. 
On the left panel we present the raw data, on the right panel 
v(t, f) has been rescaled to vt p/uz and t to t 1/z \f - f c \ v . 
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FIG. 14. (Color online) 7(i) as function of time for the RB 
case with uniform distributed disorder for wich f c = 1.5652 
using 8t = 0.1. The system size is L = 4194304. On the left, 
the raw data, on the right t rescaled. 



F. Scaling relations around the critical point 

Now that we know the accurate critical exponents of 
the model, we test them in large scale simulations around 
the critical point, i.e. at force values around the critical 
force f c . 

In the left panel of Fig|T3] we present the behavior of 
v(t) with time, at different driving forces for a string of 
size L = 2 22 = 4194304. In the right panel we scale 
the data according to Eq. (jl"B"j) with the exponents just 
obtained. Compared with Fig. [3] for a much smaller sys- 
tem, we now get a much better scaling, specially for large 
times, and since we are using now the new set of critical 
exponents. 

Consistently, in the left panel of FigOJ] we show that 
the quality of the collapse in the right panel of FigfT^] is 
closely accompanied by the development of a plateau in 
7(t) with time at / ~ f c , for t > 1000. For forces greater 
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FIG. 15. (Color online) Heuristic connection between steady- 
state and the non-steady universal relaxation at the thermo- 
dynamical critical force f c . The circle represents the system 
and the short arrows indicate its non-steady time evolution. 
At large times, in the critical region, we can think the relax- 
ation of the velocity as "driven" by the small finite-size bias of 
the critical force / c -/ c (£) ~ at the scale t = i(t) ~ t 1/z , 

where / c is the thermodynamic value. Although the geometry 
of the interface beyond £(t) still retains memory of the initial 
condition, these large wavelength modes would not affect v(t). 

or lower than f c , 7 deviates from its flat behavior towards 
decreasing or increasing, respectively. Using that in the 
scaling region we expect v(t,f) ~ t~P/ uz G v {\f — / c | I/ i 1/ ' z ) 
w(tj) ~ &'G w Qf ~ f^t 1 /'), with G v (x) and G w (x) 
some universal functions we get 7(i, /) ~ F (|/ — f c \ v 't 1 ' z )i 
with a new universal function F(y). This scaling is 
checked in the right panel of FigQ3]and again, the quality 
of the collapse is closely accompanied by the development 
of a plateau in j(t) at / w f c , occurring for t > 1000. 
This shows that our criterion for bounding the scaling 
region is consistent. 

V. DISCUSION 

We have found and characterized scaling corrections 
in the non-steady universal relaxation at the depinning 
transition. We have shown that these corrections are 
power-law like, and they are rather robust. Fortunately, 
we have found that scaling relations between the true 
exponents are violated by the effective ones (see Fig. IT!?]) , 
suggesting a practical criterion of consistency that we 
have tested using large-scale simulations. This criterion 
has allowed us to reduce the effects of the corrections and 
improve significantly the accuracy of the STD method for 
determining the critical exponents. 

Power-law corrections to the non-equilibrium scaling 
were observed in several other systems such as Ising, 
Potts and XY models (63413 • Interestingly, it is found 
that these corrections are stronger for systems with dis- 
order or frustration, or dilution [46| . Corrections are usu- 
ally used to improve the accuracy of the exponents ob- 
tained by the STD method. For the depinning transition 
we have shown that this practice does not increase much 
the accuracy of the STD method. We think that one 
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reason is that we rely in an ad-hoc model for the correc- 
tions, add extra parameters to the fit, and we still have 
to decide where to start fitting the corrected formula (the 
non- universal microscopic time). We have shown that a 
better approach for depinning is to use scaling relations 
between the exponents that are violated in the presence 
of corrections. This allows us to find the truly assymp- 
totic regime where we can expect to get very close to the 
true exponents. We think it may be useful to try this 
kind of approach in other problems. 

An heuristic explanation for the power-law corrections 
can be proposed by relating Eq. (fTTj) with the steady- 
state equation v ~ £ > ~PI V ' . As we already pointed out, 
replacing £ by the correlation length £(t) ~ t 1 ^ yields 
Eq. (|TT|) . obtained from general arguments. This suggests 
two simple pictures. 

In the first picture we simply introduce phenomeno- 
logical corrections to the dynamical exponent as £'{t) ~ 
t 1 / z (l + | C3 1 £ ~ e ) with C3 and e parameters of the cor- 
rection. This type of correction has been introduced 
in studies of the off-equilibrium critical dynamics of the 
three-dimensional diluted Ising model, for instance, and 
attributed to the big gest irrelevant eigenvalue of the RG 
in the dynamics [71(. If we assume that v(t) ~ £'(t)~P/ u 
holds, it is easy to see that such correction would lead 
to a corrected velocity v(t) ~ t~^l zv {\ + \c2\t~ a ) with 
parameters C2 and a. Something similar would happen 
with w(t). Note however that if w(t) ~ t'(tp also holds, 
we would not explain the corrections observed, in the 
same time-regime, in y(t) = w(t)/v(t)t. In any case, 
it would be interesting to sec whether such corrections 
for £(t), which are certainly absent in the non-steady re- 
laxation of a flat line in the EW model at T > for 
instance (and more generally of non-markovian Gaussian 
signals [zl, [73| related with manifolds with long-range 
elasticity), can be explained from the complicate coupling 
between Fourier modes induced by the non-linearity of 
the quenched disorder. It is known that "non- Gaussian" 
effects are indeed subtle [73, [7j| . 

In the second picture we assume that the universal non- 
steady relaxation at the thermodynamic threshold f c is 
actually "quasistatically driven" by the finite-size bias of 
the critical-force. In other words, the small velocity of 
the large string is roughly controlled by the steady-state 
dynamics of an effective string of increasing size £(t) and 
width w(t) ~ £{tf, as v(t) ~ [f c - f c {£(t))f. Indeed, if 
we assume that this finite-size bias with respect to the 
thermodynamic value f c is positive, and scales with £ 
exactly as the finite-size critical force fluctuations around 
its mean value, i.e. f c — f c (L) ~ L 1 ^, we get 

«(*) ~ [fc - fMt))f ~ Ut) P,v ~ (25) 

equivalent to Eq. (|11[) . This picture is schematized in 
Fig. [T5] The assumption of a positive bias f c — f c (L) ~ 
L~ x l v can be justified from results using the quasi- 
statically velocity- driven ensemble, where the interface 
is driven by a parabolic potential m 2 (vt — u). For small 
to, it was shown, via functional renormalization group, 



that f c (m) = f c - \ci\m 2 c = f c - Ic^m 1 ^, with a a 
constant [56[. Since the parabola imposes a character- 
istic length L m ~ m , beyond which the driven in- 
terface looks flat, we can write [f c — f c (m)] ~ Lm . 
Identifying L m with £(t), we explain the assumed pos- 
itive bias and get again Eq. (fTTj) . This identification 
of lengths is supported by the fact that both, L m and 
£{t) (for the initially flat line) represent the same geo- 
metrical crossover. For lengths below £{f)(pi L m ), the 
string is steady-state quasi-equilibrated with the pinning 
landscape yielding the roughness exponent £, but it re- 
mains macroscopically flat due to the memory of the ini- 
tial condition (due to the confinement of the parabola) 
at scales above £(t)(oi £,„). With the picture above 
we can now speculate about a possible mechanism for 
generating the scaling correction. Since the formula 
fc(£) = fc - \ci\£~ 1/v is expected to hold only in the- 
limit of large I it is plausible to add corrections to it, 
which decay to zero in the large £ limit. If we add a 
correction of the type f c (£) = f c - \c 1 \£- 1 / v {l + \c 2 \£~ a ) 
with a > and c-i constants, and use v{£) ~ [f c — / c (^)] /3 , 
we get v(t) - t-Pf* v {l + \c 2 \t a / z ), using £ = £{t) = t 1 ^, 
the same type of correction we observe in our simula- 
tions. The correction just proposed is not completely 
ad-hoc, but qualitatively consistent (same sign and order 
of magnitude) with the corrections to the FRG formula 
/ c (to) = f c — |ci|m 2_< » observed at intermediate values 
of to in simulations in the quasistatically velocity-driven 
ensemble (56|. It would be thus interesting, in first place, 
to check quantitatively whether such corrections are di- 
rectly related with the ones we report here, and in sec- 
ond place to check whether one can explain them directly 
from the FRG-flow behaviour. This picture for the cor- 
rections suggests that both finite-size steady-state simu- 
lations or finite-time non-steady simulation would yield, 
in particular, an effective exponent for (3 larger than the 
true one. This may explain the trend of discrepancies 
found in the literature. In any case, an analytical and 
more fundamental description of these effects would be 
welcomed. 

The critical exponents associated to the depinning 
transition for the QEW model in one dimension as re- 
ported in previous works are shown in Table [TT1 toghcthcr 
with the values obtained in the present work. Comparing 
previous estimates, the largest discrepancy is found for 
the (3 exponent. Our value j3 = 0.245 ± 0.006 is more 
precise than previous reported values, agrees well with 
the ones obtained in automaton models [9j and in molec- 
ular dynamics simulations [37l | , but disagrees appreciably 
with respect to random-field Ising model simulations [38j 
and other molecular dynamics simulations [48j | which give 
about 30% larger values. All these works correspond to 
the steady-state dynamics. They thus rely on a proper 
steady-state equilibration and on a precise knowledge of 
the critical threshold, which is difficult to achieve, except 
in periodic samples [ll|, |48| . In steady-state simulations 
with dynamically generated disorder, such as the ones in 
Ref. [33], one should be cautious at long times or large 
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exponent 


estimate 


reference 


c 


0.97 ±0.05 


[37] 




1.25 ±0.05 


M 




1.25 ±0.01 


[76] 




1.2 ±0.2 


[6] 




1.26 ±0.01 


[65] 




1.250 ±0.005 


this work 


z 


1.42 ±0.04 


[9] 




1.54 ±0.05 


[76] 




1.35 ±0.2 


M 




1.433 ± 0.007 


this work 


V 


1.05 ±0.1 


[37] 




1.00 ±0.05 


[38] 




1.1 ±0.1 


[76] 




1.25 ±0.3 


[6] 




1.29 ±0.05 


[48] 




1.33 ±0.01 


[50] 




1.333 ±0.007 


this work 


p 


0.25 ±0.03 


[33] 




0.24 ±0.01 


[37] 




0.34 ± 0.04 


[38] 




0.40 ±0.05 


[76] 




0.25 ±0.03 


M 




0.22 ±0.02 


[7JJ 




0.2 ±0.2 


® 




0.33 ±0.02 


[48] 




0.245 ± 0.006 


this work 



TABLE II. Representative values for the depinning exponent 
the dynamical exponent z, and the critical exponents v and 
P, reported in the literature. The major dispersion is seen in 
the value of /3. 



center of mass displacements, of the critical force extreme 
statistics. Since, in this case, f c can be considered as the 
maximum among ~ M / independent typical critical 
forces [13, [BTL [78|. Therefore, a finite- velocity steady- 
state might not exist at zero temperature if the critical 
force statistics tends to Gumbel's type for large M/ L 1 * for 
instance, as the interface will eventually be blocked (by 
virtue of Middleton theorems [29[) at any finite- force. It 
is not clear how a finite temperature can cure the dom- 
inant effect of these rare events at very large averaging 
times, or eliminate the sensibility to the tails of the dis- 
tributions. At T = 0, one possibility is to perform a 
disorder average over finite time averages of the steady- 
state velocity, such that the distance covered by the cen- 
ter of mass is of order L^, and thus the critical force 
becomes typical again. The problem with this approach 
is that this critical force fluctuates and thus the control 
parameter / — f c from sample to sample, complicating 
the direct estimation of (3. At this respect finite samples 
with periodic disorder have the great advantage that we 
can calculate accurately the critical force for each sample 



using an exact algorithm [ll|. Steady-state simulations 
exploiting this algorithm were limited by the narrow scal- 
ing region for getting /3 [48[ however, bounded by finite- 
size effects on one side, and by the effects of the crossover 
to the Edwards- Wilkinson equation at large velocities on 
the other side. 

The STD method applied to the depinning transition 
has the advantage, on one hand, that it docs not suffer 
from finite-size effects, as it assumes a growing correlation 
length which should lie well below the system size. On 
the other hand, it does not suffer from extreme statistics 
neither, as the spatial region covered in the scaling region 
is at maximun of order and thus the critical force can 
not be rare. Moreover, we have shown that it actually 
detects the thermodynamical critical force corresponding 
to the random-manifold family M = kLf- , separating the 
random-periodic case for k — > 0, and the extreme Gumbcl 
case k — > 00 [5(| ■ 

In concrete, we have shown that in order to have a long 
power-law decay in v(t) we must use a large system with 
the force precisely tuned to the thermodynamic critical 
force f c . Fortunately, as shown in FigJSJ f c can be un- 
ambiguously and precisely determined from a finite-size 
analysis of the precisely known sample critical force f£ of 
periodic samples. It is also usually said that the STD has 
the advantage that we do not need to (steady-state) equi- 
librate the system. The presence of power-law like scaling 
corrections shows, however, that we do need to wait (or 
correct) until the truly assymptotic universal non-steady 
regime is reached, before determining the exponents. It 
is also worth noting that the STD method does not yield 
j3 or £ directly, as the direct steady-state method does, 
but the combination of exponents j3/vz. This is not very 
probelmatic however, since for the QEW model we ac- 
curately determine £ from the analysis of large critical 
configurations [65], v = 1/(2 — Q, from the statistical 
tilt-symmetry exact relation, and Q/ 'z, from the power- 
law increase of the width w(t), as shown in Fig. [9j It 
would be thus interesting to try the STD method in the 
computationally more convenient automaton models be- 
longing to the QEW universality class. Such models are 
described in pioneer papers such as Ref. @. Since by 
construction they involve a kind of coarse-grained dy- 
namics, it would be instructive to study the non-steady 
dynamics of this model and see what happens with the 
mesoscopic time-regime reported here. As the goal of this 
paper was not only to improve the STD method for the 
determination of the critical exponent depinning for the 
transition, but also to study a more realistic model for 
the non-steady short-time dynamics, which itself might 
be experimentally relevant, we leave such a study for the 
future. 



VI. CONCLUSIONS 

We have studied the non-steady relaxation of a driven 
one dimensional elastic interface at the depinning tran- 
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sition. Above a first, non-universal microscopic time- 
regime, we have found a non-trivial long crossover to- 
wards the non-steady macroscopic critical regime, ex- 
pected from general scaling arguments. We have shown 
that this "mesoscopic" time-regime is robust under 
changes of the microscopic disorder including its random- 
bond or random-field character, and can be fairly de- 
scribed as power-law corrections to the asymptotic scal- 
ing forms yielding the true critical exponents. These cor- 
rections may explain some numerical discrepancies found 
in the literature (as large as 30% for some exponents) , for 
this universality class. In particular they explain the ap- 
pearance of effective power-laws in the non-steady relax- 
ation with exponents presenting a systematic bias with 
respect to the critical values. To improve the accuracy 
and consistency of the STD method for extracting criti- 
cal exponents, we have implemented a practical criterion 
of consistency and tested it in large-scale (L = 2 25 ) sim- 
ulations concurrently implemented on GPU. In this way 
we obtained accurate exponents for the universality class 
of the paradigmatic continuum displacement quenched 
Edwards- Wilkinson equation. This method may be used 
to analyze exponents in different depinning universal- 
ity classes, such as the long-range elasticity, quenched 
Kardar-Parisi-Zhang, or correlated disorder classes ff^ . 

We believe that the features here observed might be ex- 
perimentally relevant. In many experiments at low tem- 
peratures with magnetic or electric domain walls, or with 
contact lines of liquids, the dynamics of the interface is 
non-stationary, in the sense that the system keeps for a 
while a memory of the initial conditions (i.e. the mem- 
ory of the preparation or "writting" of the domain wall is 
visible in the experimental time and length scales), and 
a growing dynamical correlation length is slowly devel- 
oped in presence of a driving field. The STD method 
might be thus applied experimentally to determine the 
critical field and exponents, and ultimately to determine 
the universality class of the system. In particular, we 
have shown here that scaling corrections are important 
in a "mesoscopic" time-regime, between the so-called mi- 
croscopic and the macroscopic time-regime. It would be 
important to determine whether this regime can be ac- 
cessed experimentally At this respect we note that in 
systems with weak pinning, where the microscopic time- 
regime t < t m i c is determined by a large Larkin length l c , 
we have t(t mlc ) ~ t*{,f c ~ i c Q. If t mic is large enough, 
the mesoscopic time-regime might be observable. In such 
cases, our criterion of consistency may be useful, as we 
only require to monitor the time dependence of the inter- 
face width and velocity simultaneously (by direct imaging 
for instance). This would allow to disentangle corrections 
from the truly universal part of the relaxation and get 
accurate critical steady-state exponents experimentally. 
Finally, the inclusion of thermal fluctuations would be 
important in order compare with some experiments. If 
the temperature is small enough, we expect a critical uni- 
versal relaxation yielding the zero-temperature depinning 
exponents up to lenght-scales comparable with the ther- 



mal rounding length £t ~ T~^ v l^ or up to time-scales of 
the order of The STD method can thus be also used 
to determine ip 
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Appendix: GPU-based numerical implementation 
1. Why GPGPU Computing? 

About six years ago NVIDIA, one of the main graph- 
ics processing units manufacturer, decided to take a new 
path in GPU design and launched the first GPGPU 
(General Purpouse Graphics Processing Unit), the G80 
graphic card. It was followed by the first CUDA SDK in 
2007 [8(j , a software development kit tailored to program 
the G80 using C language plus minor extensions. Since 
then, the use of GPGPUs in a wide range of applications, 
including scientific research, have growth exponentially. 
Immediately, other manufactures have entered into the 
arena of GPGPU development. Nowadays it is immi- 
nent our access to heterogeneous architecture processors 
(merging a CPU and a GPU on a single chip) and big 
clusters have already add GPUs to increase their com- 
pute power [8lj ] . 

Heterogeneous architectures came to mitigate the tech- 
nical barriers emerged in the development of faster and 
faster processors [831 ■ From a decade on, the gain of 
GFlops in modern computers is given more by their abil- 
ity of processing applications in parallel than by their in- 
crease in processor clock frequency as it is reflected in the 
multi-core processors of our laptops and PCs. As a conse- 
quence, software was pushed to be programed for parallel 
performance. The improvement of compilers relieves in 
some cases the lack of parallel implementations, but in 
many other, applications have to be re-formulated to fit 
new architectures. In this sense scientific-computing soft- 
ware is not an exception and we should rethink our usual 
codes and numerical simulation techniques. 

Independently of their manufacturer, generation or 
model, all GPUs share the same Single Instruction 
Multiple Thread (SIMT) concurrency paradigm in or- 
der to exploit their high parallelism and high memory 
bandwidth jB3j. Basically, the programming framework 
using SIMT paradigm consists in coding for an unlimited 
number of parallel threads for all practical purposes [IH 
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(typically one thread per component of our system). A 
remarkable point is that, from the programmer perspec- 
tive the actual number of cores in our GPU is not impor- 
tant at first glance. Memory issues, on the other hand, 
arise as the main problems to work out. We code as we 
were given an unlimited number of parallel processing 
units, the compiler and a formidable thread scheduler do 
the rest for us [85|. By knowing better our CPU's charac- 
teristics (memory levels and sizes, number of cores, com- 
pute capabilities, etc.) we can improve the code, orien- 
tate the compiler and take real advantage of the hardware 
potential. 

In this work, we show how powerful and useful can be 
the massive parallelism of GPUs in the framework of sta- 
tistical mechanics simulations, even without developing 
highly optimized codes. 

In short, the parallel simulation framework work as fol- 
lows. The main stream of the program runs in a single 
core of the CPU (the "Host"). The GPU (or "Device") 
is connected to the host through a high speed 10 bus slot 
PCI-Express. The GPU has its own device memory, up 
to several gigabytes. The Host runs the program's serial 
parts, manage data transfers between Host and Device, 
modifies device memory, and launch kernels that are ex- 
ecuted concurrently by hundreds of cores on the Device. 
These kernels are functions that run on the Device acting 
exclusively on device memory. 

We work with NVIDIA GPUs, as it is our available 
hardware. Simulations for the present work have been 
run on Fermi architecture (GF100) GPUs, in particular 
GeForce GTX 470, GeForce GTX 480 and Tesla C2075. 



2. GPGPU implementation of the QEW model 

In the last few years the use of GPUs to accelerate sim- 
ulations has burst out in many areas of Physics and sci- 
ence in general. In Statistical Physics several works have 
been devoted to report implementations of spin models 
on GPU architectures (se e I86l-{89i and references therein) 
and molecular dynamics |90ll9ll]. for recent examples. In 
particular, in the subject of interfases and surface growth 
models concurrent implementations of the KPZ model 
dynamics were also reported [H, [93| . All these papers 
evidence the benefits of working with GPU's implemen- 
tations in statistical physics. 

For the present work, we have implemented a GPU- 
based parallel implementation of the QEW model. Our 
codes are written in C++ and C for CUDA. At it was said 
in section UlI Bl we numerically solve Eq. ([1]) discretizing 
the system in the x-dircction in L segments of size Sx = 
1, i.e. x — > j = 0,...,L — 1, while keeping uj(t) as a 
continuous variable. 

Essentially, at each step we compute in parallel the lo- 
cal forces acting on each segment Uj, adding the elastic 
force given by the Laplacian, the pinning force coming 
from the disorder potential, the external driving force 
and eventually a random Langevin noise playing the role 



of a thermal bath. Given the instantaneous local forces, 
an Euler evolution step is performed also in parallel. 
Moreover, observables calculations and averages are com- 
puted also by parallelized routines taking full advantage 
of the GPU concurrence. 

To model the continuous quenched random potential, 
we can either read or dynamically generate uncorrelated 
random numbers with a finite variance at the integer val- 
ues of u and x and use interpolation to get F p (u 7 x). We 
have developed two different codes realizing these op- 
tions. In our first approach, the disorder potential is 
determined from a precalculated L x M array of random 
numbers taken from a given distribution. This is none 
but the parallelized version of the code used in previ- 
ous works [36|, 0, 0|- Our second approach consists, 
instead, in generating the disorder dynamically as the 
string moves. This new approach allows the system size 
in the u direction to be virtually infinite, but a period 
for M can be imposed if desired. In both approaches we 
have used periodic boundary conditions in the longitudi- 
nal direction, so uq is elastically coupled with ul-i- 

a. Parallelized QEW line on a preset disorder potential 

In our first implementation the disorder potential is 
constructed from a L x M matrix of real values V(i,j) 
taken randomly from a Gaussian or Uniform distribution. 

Given V(i,j), we associates each value with a discrete 
position i = \ u\ , j = x (where \_y\ stands for the discrete 
part of a given continuous variable y). We can either 
construct a continuous spline for each x by (linearly or 
cubicly) interpolating the values V(i,j) for fixed j, in 
which case we have a random bond potential and the 
pinning force F p (u,x) can be derived for each value of 
u. Alternatively, we can consider V(|_uJ,i) as a step 
function force itself, in which case we have a random 
field. 

Both the string u{x) and the pinning force F p (u,x) 
can be optionally chosen to be float or double type 
variables. We have not noticed precision effects in our 
test runs. Nevertheless, for all results presented in this 
work we ensure u(x) to be a double precision array, since 
it is an accumulative array during the string evolution. 

The simulation of each sample consists of the following 
steps: 

1) Choose an initial condition Uj(t = 0) for the string. 

2) Set the disorder matrix V(i,j). 

3) If a cubic spline is desired, calculate the potential's 
second derivative in each integer for interpolation. 

4) Run the desired amount of updates (time steps) 
consisting of: 

— Calculate the total force at each point Uj of 
the string (including interpolation of the dis- 
order spline when is needed). 
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— Evolve u(x,t) with an Euler integration. 

— At selected times, calculate and accumu- 
late for future averages the string's center of 
mass displacement u cm (t), velocity v(t) and 
quadratic width w 2 (t). 

— Eventually, calculate and accumulate the 
structure factor S q (t). 

5) Average and print results. 

Arrays of Steps 1 and 2 can be either computed on 
the Host and then copied to the Device or directly com- 
puted on the Device. For the case where we need cubic 
splines of the disorder potential, we also need to compute 
an array holding the second derivative of the potential 
V(|_wJ , z), this is make in Step 3. Since this is done only 
once for each sample, we simply implement in the Host 
the usual Numerical Recipes [95| tri-diagonal solver to 
calculate it. Step 4 proceeds completely on the Device, 
with full parallelism, launching consecutive CUDA Ker- 
nels. For example, for the kernel which computes the 
total force at each point, each thread j (J = 0, . . . , L— 1) 
independently reads the pinning force F p (j) from a device 
memory array, calculates the elastic force using the in- 
stantaneous value of u(j) and its neighbors sites u(j — 1) , 
u(J + 1) (preserving periodic boundary conditions), adds 
the uniform driving force, calls a random number routine 
and adds a Langevin noise (if temperature is not zero) 
and finally writes the resulting total force acting at point 
(u, j) on a global array. With an array of forces at hand, 
the Euler evolution step is trivially parallel. It is worth 
stressing that we are not deviating from the usual (se- 
rial) dynamics at any point, since we do not modify the 
basic algorithm, we just replace dumb loops with CUDA 
kernels. Calculations of averages over x to obtain u(t), 
v(t), w 2 (t) are also executed in parallel taking advantage 
of the Thrust library provided in the CUDA Toolkit. 
S q (t) is calculated using the CUFFT library (97} , a parallel 
Fast Fourier Transform library for CUDA also provided 
in the CUDA Toolkit j98|. Copies from Device to Host 
only take place in Step 5, where we average over samples 
and print the results on a file. 

The implementation of a random number generator 
(RNG) in CPUs is a topic of study itself (for a recent 
review see [99|). For this first implementation we have 
used the Multiply- With-Carry (MWC) RNG,which has 
proven to success in classical spin systems [H, |99|. A 
great advantage of MWC is that it allows the simultane- 
ous generation of several independent random sequences 
(basically, each thread has it's own generator) and it has 
a good trade-out of performance and statistical quality. 

b. Parallelized QEW line on dynamically generated disorder 

Our second code is based on a different approach. It 
basically consists of generating the disorder dynamically. 
That is, as the line moves in the u direction we generate 



the underlying disorder V(i,j) at each discrete position 
( \u\ , j) in a consistent manner. 

For the RF case we define F p (u,x) = while 
for the linear spline RB case we interpolate V(u, x) = 
[V([u\ +l,x) — V([u\),x]/a, with a = 1, and derive the 
force at each value of u. 

The consistence of this procedure relies on the ability 
to read always the same random number V(|_uJ,x) in a 
particular place (|_uj,a;). This can be done, of course, 
storing each random number as we generate them, but 
this option would be time expensive in terms of mem- 
ory reads, which is not a good option in general, even 
less for a GPU implementation. A much more elegant 
solution come from the implementation of counter-based 
RNG. Let us extend on this point because it is crucial. 
Usual RNGs base their quality and large period on oper- 
ating with linear algebra and/or logical transformations. 
Counter-based generators, instead, have no state. These 
RNGs use simple indexes or counters as an input and a 
well complex hash function to produce a pseudo-random 
number sequence. While such generators have not re- 
ceived much attention for their use in simulations they 
are common to the functions used in secret-key cryp- 
tography. These kind of generators fit perfectly in the 
GPGPU programing framework, since they do not need 
to store a state, they allow several independent sequences 
by setting different keys, and, the most important thing, 
they have been proven to show excellent quality and per- 
formance [99l ]. 

We have i mple mented the recently introduced 
PHILOX RNG [lOOj . In the sequence of pseudo-random 
numbers with key k and counter n 

x n = f k (n), (A.l) 

we have chosen k to be a different number for each thread 
j = 0,...,L — 1 (k = G(j), where G is some bijec- 
tive function) and n to be the integer part of the in- 
stantaneous string position for that thread n = [u(j)\ . 
Though, given a position ( \u\ , x) we have a univocal way 
of determining U([uJ,a;) on-the-fly. Since the pseudo- 
cryptographic biyections used by PHILOX are designed 
to deliver essentially indistinguishable outputs for any 
value of k, different keys assure us independent random- 
number streams (up to 2 64 independent sequences for a 
64-bit key). Notice that if the string moves forward and 
backwards (which is actually the case when we have finite 
temperature), the underlying disorder already visited can 
be easily recovered without any memory record. While 
going forward and backward on a sequence is trivial for 
a counter-based RNG, it is an almost impossible task for 
usual RNGs. 

The implementation of PHILOX on GPUs encouraged 
us to propose a different approach to the disorder repre- 
sentation, saving huge amount of memory storage (within 
this approach, our system effective size is L instead of 
L x M) and reading, leading to an accelerated GPGPU 
performance. Notice as well that, if we want to preserve 
a finite period M for the disorder in the u direction, that 
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can be trivially done adding a modulo operation in the 
definition of the counter n. 

Besides this main point on the dynamical generation 
of the disorder, the algorithm structure is based on the 
first approach above. The drawback of this approach is 
probably that in principle it only let to simulate linear 
spline disorders. But workarounds can be find in future 
implementations. 

The peculiarity of our second code, besides the RNG 
used, is that it is completely implemented using Thrust 
functions instead of setting-up and launching CUDA ker- 
nels. Thrust is a parallel algorithms library (9(| which 
resembles the C++ Standard Template Library. Its high- 
level interface allows development without much knowl- 
edge of CPU's architecture. In addition, it enables porta- 
bility between CPUs and multicore CPUs. The very 
same code can be compiled with an "OpenMP" flag for 
the backend system and it will use our multicore CPU as 
the Device instead of using the GPU. 



c. Comparative performance 

In this work, our intention was to perform a first step in 
the implementation on heterogeneous hardware of driven 
clastic systems in disordered media. We are not pretend- 
ing to present a highly-optimized code, but a code that 
allows us to run simulations that where practically im- 
possible to perform before in reasonable computational 
times. Our main objective was to obtain original physi- 
cal results. The full optimization of the codes is left for 
further work or for other enthusiastic developers. 

That said, we think that we have developed good pieces 
of useful code and we would like to present their perfor- 
mance here and open them to the community for use, 
improvement and modification |lOl| . 

Testing, validation and verification arc important facts 
on any kind of software, in particular on scientific soft- 
ware. Although wc have not programed autonomous test- 
ings we have take carefully in mind this aspects at each 
step of the programming. We have tested our codes in 
the well known particular chooses of the parameters, e.g., 
without disorder, zero force, etc. Among passive security 
measures, we use assertions (boolean predicates) related 
to hardware limitations or to enforce preconditions on 
algorithm running. We also check every return condition 
of CUDA library calls and kernels. 

For our benchmarking we ran the "old" single thread, 
CPU-based, code and the two GPU-based codes pre- 
sented in the previous subsections. Each code is not to- 
tally comparable with the others in programing terms. 
Therefore, we have no intentions to report speed-ups in 
terms of hardware, but rather, we intend to highlight 
the practical speed-up that comes out when one compare 
the computing times of different software-hardware (as a 
whole) options. 

In Fig ll6l we show the mean computing time in mil- 
liseconds for a single update step as a function of the 
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FIG. 16. (Color online) The main plot shows the update step 
mean computing time T up d a te in milliseconds as a function of 
the system size L, from the execution of our serial and par- 
allel codes in a CPU + NVIDIA GTX 470 GPU platform. 
In the inset we show tl = T up d a te/L in nanoseconds and the 
practical speed-up of two cases with dashed lines, showing ac- 
celerations that range between 187 and 521 for large systems. 
Legends are LS:= Linear Spline, CS:= Cubic Spline, SP:= 
Single Precision, DP:= Double Precision, dgd: dynamically 
generated disorder. 



system size. The values shown here where obtained run- 
ning 2 15 steps of the update procedure and dividing by 
that number of steps. The update procedure includes cal- 
culating for each j = 0, . . . , (L — 1) all the instantaneous 
involved forces, and preforming an Euler integration of 
u(x). In the rest of this appendix we briefly discuss our 
interpretation of this benchmark. 

As can be seen from FigfTrJl for large enough system 
sizes the computing time of all codes scale approximately 
as ~ L, as is expected since the number of arithmetic and 
memory operations grows linearly with the number of 
elementary components in the string of our model. Wc 
compare computing times in this linear scaling regime 
first, and the we comment about the "small" system sizes. 

Th e CP U code is written in C/C++ using the 
GSL Il02l lib rary for accelerated interpolation and the 
Blitz++ [103j library for vector and matrix operations 



104j | . It is not a highly optimized code but all the com- 
monly known good practices for a serial code have been 
considered. 

Our CUDA codes are two, code "A" , the one presented 
in See l 2 al where we read the disorder from an array, and 
code "B" presented in See l 2 bl where we dynamically gen- 
erate the disorder. For code A we use pure CUDA Ker- 
nels for the update steps, while for c ode B the update 
procedure uses Thrust functions (105| . To compare dif- 
ferent cases, we have varied the types and operations 
precision (Float or Double) , used different spline interpo- 
lation schemes (Cubic/Linear) when allowed, and tested 
the codes in different hardware platforms. 

The first thing that we can stress from the benchmark- 
ing is that the Double Precision (DP) Cubic Spline (CS) 
CUDA implementation of the QEW model (code A, DP- 
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TABLE III. Mean update time execution and their related 
CPU vs. CPU+GPU practical speed-up for a system size L = 
65536 taken as an example. Platform: AMD Phenom(tm) II 
X4 955 Processor @3.2GHz, NVIDIA Tesla C2075, NVIDIA 
GTX 470. CS: Cubic Spline, LS: Linear Spline, DP: Double 
Precision, SP: Single Precision, psu: practical speed-up. All 
speed-ups are approximate numbers. 
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0.179 
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36.55 


0.098 
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~ 36 
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TABLE IV. Mean update time execution and their related 
CPU vs. CPU+GPU practical speed-up for a system size L = 
65536 taken as an example. Platform: Intel(R) Core(TM)2 
Quad CPU Q9550 @2.83GHz, NVIDIA GTX 480. CS: Cubic 
Spline, LS: Linear Spline, DP: Double Precision, SP: Single 
Precision, psu: practical speed-up. All speed-ups are approx- 
imate numbers. 



CS option), the one which we have first implemented to 
reproduce the well known results in CPU implementa- 
tions, gives us a great improvement (above 138x). The 
conservative user can choose to work with it to obtain 
completely safe results and still have a great saving in 
simulation time. 

Let us comment further options, that have been sum- 
marized in Tables IIII1 IIVI for the two different hetero- 
geneous platforms we have worked with. Here we have 
chosen L = 65536 as an example in the regime where 
execution time grows linearly with size. For the DP- 
CS case the best practical speed-up (186x) is obtained 
with the GTX 480 GPU. On each case, we compare be- 
tween C++ serial and CUDA parallel codes running on 
the same machine, one of them using a single CPU core 
and the other using a CUDA parallelized scheme and a 
GPU in addition to the CPU core. If we use Single Pre- 
cision (SP) variables and operations, which as far as we 
have seen does not affect our results, computing times 
drops around ~ 30%, representing a boost going up from 
181x to 238x on the GTX 470, for example. 

Increasing M does not represent a noticeable impact in 
the code performance but it does represent a considerably 
increase in memory demand, since we need to allocate 
arrays of size L x M. The Tesla C2075 GPU shows the 
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smaller speed-ups in all cases, even in double precision it 
does not improve the GTX 470 performance, nevertheless 
its big global memory (5.4GB) allows us to simulate large 
L x M systems. 

Another appreciable improvement in computing time 
is seen when we use a linear spline (LS) either than a 
cubic spline for the determination of the pinning poten- 
tial. GPU computing times are in this case more than 
2.4 times faster than the ones obtained for a CS. In other 
words, the linear spline approach take ~ 40% of the time 
that the cubic spline takes. This can be understand from 
the fact that CS requires each thread to perform two ad- 
ditional read operations from device global memory as 
compared to the LS, the memory reads of an array hold- 
ing the second derivative of the disordered potential at 
each point. While in CPU cores (with a good cache mem- 
ory) this is not expected to be a dramatically sensitive 
issue (we see a slowdown of ~ 17% in execution time), on 
GPUs this produce a more noticeable impact since per- 
formance is clearly governed by memory transfers. The 
averaged single update step time drops to 0.123ms for 
the GTX 470 case, while our CPU code time only drops 
to 46.46ms, representing a practical speed-up of 377x. 

This very good performance of the LS case suits also to 
our GPU code B, the one where we generate the disorder 
dynamically. In this case, besides the good performance 
of the LS spline, we save a lot of memory storage, since we 
are free of disorder arrays. This allows us to increase con- 
siderably L, as can be seen in Fig ll6l and was exploited in 
Section ITVl We claim that with these algorithm realistic 
simulations of sizes L = 10 9 can be reached using only a 
single GPU (it should have more than 4GB of memory) 
in a desktop computer. 

A word should be said about what happens for "small" 
system sizes. Bellow L = 8192 the computing times for 
our GPU codes tend to saturate to a common value for 
different L. This is because, in these cases, we are not 
taking real advantage of the GPU power. In GPGPU 
computing, memory latency issues are hidden under a 
formidable scheduling of multiple threads [gij . If threads 
are not enough, a lower bound imposed by the latency of 
global memory access dominates the computing time. 

The benchmark of our code B, shows a particularity. 
We see that for small sizes its performance is bad as com- 
pared with the code A, and also worst than the CPU code 
for systems L < 256. This is something that it is not to- 
tally clear for us, but we guess that it is related with an 
overhead produced by the Thrust functions. Since we 
do not control how those kernels are launched we cannot 
avoid this overhead. Nevertheless, it is worth notice that 
for large systems this code has roughly the same perfor- 
mance of code A and the advantage of having much less 
memory limitations. In some sense, we were expecting 
a better performance of this code since here we do not 
need to read the disorder potential from global memory 
at each step; but it seems that the PHILOX RNG call 
for each thread is as time consuming as a global memory 
access. 
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Let us remark that, while the CPU code computing 
time scales with L a with a ~ 1, as far as we have increase 
the system size all GPU codes versions are still scaling 
with a < 1, thus giving greater speedups as we increase 
the system size. Somehow, as we saturates more and 
more the GPU, massively parallelism gives back us better 
and better performance. 



To conclude, let us say that parallel implementations 
of this kind of models are doable, and available desktop 
(heterogeneous) hardware allow accelerations that start 
in the hundreds. This can be seen itself enhanced in clus- 
ter facilities of CPUs+GPUs. Therefore, a door is open 
to take up and study numerically some problems that 
were relegated because of computational impediments. 
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